Eunomia: Enabling User-specified Fine-Grained Search in Symbolically Executing WebAssembly Binaries
Although existing techniques have proposed automated approaches to alleviate
the path explosion problem of symbolic execution, users still need to optimize
symbolic execution by applying various searching strategies carefully. As
existing approaches mainly support only coarse-grained global searching
strategies, they cannot efficiently traverse through complex code structures.
In this paper, we propose Eunomia, a symbolic execution technique that allows
users to specify local domain knowledge to enable fine-grained search. In
Eunomia, we design an expressive DSL, Aes, that lets users precisely pinpoint
local searching strategies to different parts of the target program. To further
optimize local searching strategies, we design an interval-based algorithm that
automatically isolates the context of variables for different local searching
strategies, avoiding conflicts between local searching strategies for the same
variable. We implement Eunomia as a symbolic execution platform targeting
WebAssembly, which enables us to analyze applications written in various
languages (like C and Go) that can be compiled into WebAssembly. To the best of
our knowledge, Eunomia is the first symbolic execution engine that supports the
full features of the WebAssembly runtime. We evaluate Eunomia with a dedicated
microbenchmark suite for symbolic execution and six real-world applications.
Our evaluation shows that Eunomia accelerates bug detection in real-world
applications by up to three orders of magnitude. According to the results of a
comprehensive user study, users can significantly improve the efficiency and
effectiveness of symbolic execution by writing a simple and intuitive Aes
script. Besides verifying six known real-world bugs, Eunomia also detected two
new zero-day bugs in a popular open-source project, Collections-C.
Comment: Accepted by ACM SIGSOFT International Symposium on Software Testing
and Analysis (ISSTA) 202
CoderEval: A Benchmark of Pragmatic Code Generation with Generative Pre-trained Models
Code generation models based on the pre-training and fine-tuning paradigm
have been increasingly explored by both academia and industry, resulting in
well-known industrial models such as Codex, CodeGen, and PanGu-Coder. To
evaluate the effectiveness of these models, multiple benchmarks have been
proposed, but they include only cases of generating a standalone function,
i.e., a function that may invoke or access only built-in functions and standard
libraries. However, non-standalone functions, which typically are not included
in the existing benchmarks, constitute more than 70% of the functions in
popular open-source projects, and evaluating models' effectiveness on
standalone functions cannot reflect these models' effectiveness on pragmatic
code generation scenarios.
To help bridge the preceding gap, in this paper, we propose a benchmark named
CoderEval, consisting of 230 Python and 230 Java code generation tasks
carefully curated from popular real-world open-source projects and a
self-contained execution platform to automatically assess the functional
correctness of generated code. CoderEval supports code generation tasks from
six levels of context dependency, where context refers to code elements such as
types, APIs, variables, and constants defined outside the function under
generation but within the dependent third-party libraries, current class, file,
or project. CoderEval can be used to evaluate the effectiveness of models in
generating code beyond only standalone functions. By evaluating three code
generation models on CoderEval, we find that the effectiveness of these models
in generating standalone functions is substantially higher than that in
generating non-standalone functions. Our analysis highlights the current
progress and pinpoints future directions to further improve a model's
effectiveness by leveraging contextual information for pragmatic code
generation.
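The distinction between standalone and non-standalone functions can be illustrated with a minimal Python sketch. All names here (`slugify`, `Config`, `DEFAULT_SEPARATOR`, `make_slug`) are hypothetical and are not taken from CoderEval; the sketch only shows what kind of outside context a model would need to see.

```python
import re


# Standalone: relies only on built-ins and the standard library (re).
def slugify(text: str) -> str:
    """Lowercase the text and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


# Hypothetical project context defined outside the function under generation:
# a file-level constant and a project class.
DEFAULT_SEPARATOR = "_"


class Config:
    def __init__(self, prefix: str) -> None:
        self.prefix = prefix


# Non-standalone: correct generation requires knowing about Config,
# DEFAULT_SEPARATOR, and slugify, none of which are built-ins.
def make_slug(config: Config, title: str) -> str:
    body = slugify(title).replace("-", DEFAULT_SEPARATOR)
    return config.prefix + DEFAULT_SEPARATOR + body
```

A model given only the signature of `make_slug` cannot know what `Config` holds or which separator the project uses, which is the kind of contextual information the abstract refers to.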
Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs
Lightweight static bug-detection tools such as FindBugs, PMD, Jlint, and Lint4j detect bugs with the knowledge of generic bug patterns (e.g., objects of java.io.InputStream are not closed in time after use). Besides generic bug patterns, different projects under analysis may have some project-specific bug patterns. For example, in a revision of the Xerces project, the class field “fDTDHandler” is dereferenced without proper null-checks, while it could actually be null at runtime. We name such bug patterns directly related to objects instantiated in specific projects as Project-Specific Bug Patterns (PSBPs). Due to the lack of such PSBP knowledge, existing tools usually fail to effectively detect most bugs of this kind. We name bugs belonging to the same project and sharing the same PSBP as sibling bugs. If some sibling bugs are fixed in a fix revision but some others remain, we treat such a fix as an incomplete fix. To address such incomplete fixes, we propose a PSBP-based approach for detecting sibling bugs and implement a tool called Sibling-Bug Detector (SBD). Given a fix revision, SBD first infers the PSBPs implied by the fix revision. Then, based on the inferred PSBPs, SBD detects their related sibling bugs in the same project. To evaluate SBD, we apply it to seven popular open-source projects. Among the 108 warnings reported by SBD, 63 have been confirmed as real bugs by the project developers, while two existing popular static detectors (FindBugs and PMD) cannot report most of them.
Inferring Dependency Constraints on Parameters for Web Services
Recently, many popular websites such as Twitter and Flickr expose their data through web service APIs, enabling third-party organizations to develop client applications that provide functionalities beyond what the original websites offer. These client applications should follow certain constraints in order to correctly interact with the web services. One common type of such constraints is Dependency Constraints on Parameters. Given a web service operation O and its parameters Pi, Pj,…, these constraints describe the requirement on one parameter Pi that is dependent on the conditions of some other parameter(s) Pj. For example, when requesting the Twitter operation “GET statuses/user_timeline”, a user_id parameter must be provided if a screen_name parameter is not provided. Violations of such constraints can cause fatal errors or incorrect results in the client applications. However, these constraint